Abstract

In this paper we present a nonparametric method for extending functional regression 
methodology to the situation where more than one functional covariate is used to predict 
a functional response. Borrowing the idea from Kadri et al. (2010a), the method, which
supports mixed discrete and continuous explanatory variables, is based on estimating
a function-valued function in reproducing kernel Hilbert spaces by virtue of positive 
operator-valued kernels. 



1. Introduction 

The analysis of interaction effects between continuous variables in multiple regression 
has received a significant amount of attention from the research community. In recent 
years, a large part of research has been focused on functional regression where continuous 
data are represented by real-valued functions rather than by discrete, finite dimensional 
vectors. This is often the case in functional data analysis (FDA) when observed data 
have been measured over a densely sampled grid. We refer the reader to Ramsay and 
Silverman (2002, 2005) and Ferraty and Vieu (2006) for more details on functional data 
analysis of densely sampled data or fully observed trajectories. In this context, various
functional regression models (Ramsay and Silverman, 2005) have been proposed according
to the nature of the explanatory (or covariate) and response variables. Perhaps the most
widely studied is the generalized functional linear model, where covariates are functions
and responses are scalars (Cardot et al., 1999, 2003; James, 2002; Müller and Stadtmüller,
2005; Preda, 2007).



In this paper, we are interested in the case of regression models with a functional 
response. Two subcategories of such models have appeared in the FDA literature: models
in which covariates are scalars and responses are functions, also known as "functional
response models" (Faraway, 1997; Chiou et al., 2004), and models in which both covariates
and responses are functions (Ramsay and Dalzell, 1991; He et al., 2000; Cuevas et al., 2002;
Prchal and Sarda, 2007; Antoch et al., 2008). In this work, we pay particular attention to
the latter situation, which corresponds to extending the multivariate linear regression
model to the functional case where all the components involved in the model are functions.
Unlike most previous work, which considers only one functional covariate, we wish to
perform a regression analysis in which multiple functional covariates are used to predict
a functional response. The methodology concerned with solving such a task is
referred to as multiple functional regression.

Previous studies on multiple functional regression (Han et al., 2007; Matsui et al., 
2009; Valderrama et al., 2010) assume a linear relationship between functional covari- 
ates and responses and model this relationship via a multiple functional linear regression 
model which generalizes the model in Ramsay and Dalzell (1991) to deal with more 
than one covariate variable. However, extensions to nonparametric models have not 
been considered. Nonparametric functional regression (Ferraty and Vieu, 2002,2003) is 
addressed mostly in the context of functional covariates and scalar responses. More re- 
cently, Lian (2007) and Kadri et al. (2010a) showed how function-valued reproducing 
kernel Hilbert spaces (RKHS) and operator-valued kernels can be used for the nonpara- 
metric estimation of the regression function when both covariates and responses are 
curves. Building on these works, we present in this paper a nonparametric multiple
functional regression method in which several functions serve as predictors. Furthermore,
we aim at extending this method to handle mixed discrete and functional explanatory
variables. This should be helpful in situations where a subset of the regressors consists
of repeated observations of an outcome variable and the remaining ones are independent scalar
or categorical variables. In Antoch et al. (2008), for example, the authors discuss the use
of a functional linear regression model with a functional response to predict electricity 
consumption and mention that including the knowledge of special events such as festive 
days in the estimation procedure may improve the prediction. 

The remainder of this paper is organized as follows. Section 2 reviews the multiple 
functional linear regression model and discusses its nonparametric extension. This section 
also describes the RKHS-based estimation procedure for the nonparametric multiple 
functional regression model. Section 3 concludes the paper. 

2. Multiple functional regression 

Before presenting our nonparametric multiple functional regression procedure, we start
this section with a brief overview of the multiple functional linear regression model (Matsui
et al., 2009; Valderrama et al., 2010). This model extends functional linear regression
with a functional response (Ramsay and Dalzell, 1991; Ramsay and Silverman, 2005) to
deal with more than one covariate and seeks to explain a functional response variable
$y(t)$ by several functional covariates $x_k(s)$. A multiple functional linear regression model
is formulated as follows:

$$y_i(t) = \alpha(t) + \sum_{k=1}^{p} \int_{I_s} x_{ik}(s)\,\beta_k(s,t)\,ds + \epsilon_i(t), \quad t \in I_t,\ i = 1,\ldots,n, \qquad (1)$$

where $\alpha(t)$ is the mean function, $p$ is the number of functional covariates, $n$ is the
number of observations, $\beta_k(s,t)$ is the regression function for the $k$-th covariate and
$\epsilon_i(t)$ a random error function. To estimate the functional parameters of this model, one
can consider the centered covariate and response variables to eliminate the functional
intercept $\alpha$. Then, the $\beta_k(\cdot,\cdot)$ are approximated by a linear combination of basis functions
and the corresponding real-valued basis coefficients can be estimated by minimizing a
penalized least squares criterion. Good candidates for the basis functions include the
Fourier basis (Ramsay and Silverman, 2005) and the B-spline basis (Prchal and Sarda,
2007).
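
As an illustration of model (1), the integral term can be evaluated numerically once the covariates and the regression functions are sampled on a grid. The following Python sketch is not part of the original method description: the function names, the array layout and the use of the trapezoidal rule are assumptions made here for illustration only.

```python
import numpy as np


def trapezoid_weights(grid):
    """Trapezoidal quadrature weights for a (possibly non-uniform) 1-D grid."""
    grid = np.asarray(grid, dtype=float)
    w = np.zeros_like(grid)
    d = np.diff(grid)
    w[:-1] += d / 2.0
    w[1:] += d / 2.0
    return w


def predict_linear(alpha, X, beta, s_grid):
    """Discretized version of model (1) without the error term.

    alpha  : (m,)       intercept alpha(t_l) on the output grid (assumed layout)
    X      : (n, p, q)  covariates x_ik(s_j) on the input grid
    beta   : (p, q, m)  regression functions beta_k(s_j, t_l)
    s_grid : (q,)       input grid used to approximate the integral over s
    returns: (n, m)     approximations of y_i(t_l)
    """
    w = trapezoid_weights(s_grid)
    # sum over covariates k and grid points j of x_ik(s_j) * w_j * beta_k(s_j, t_l)
    integral = np.einsum("ikj,j,kjl->il", X, w, beta)
    return np.asarray(alpha, dtype=float)[None, :] + integral
```

With $x_{ik}(s) = s$ on $[0, 1]$, $\beta_k \equiv 1$, $\alpha \equiv 0$ and $p = 2$, each integral equals $1/2$, so the sketch returns 1 for every observation and every $t$.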

It is well known that parametric models suffer from the restriction that the input- 
output relationship has to be specified a priori. By allowing the data to model the 
relationships among variables, nonparametric models have emerged as a powerful ap- 
proach for addressing this problem. In this context and from functional input-output 
data $(x_i(s), y_i(t))_{i=1}^{n} \in (\mathcal{G}_x)^p \times \mathcal{G}_y$, where $\mathcal{G}_x : I_s \to \mathbb{R}$ and $\mathcal{G}_y : I_t \to \mathbb{R}$, a nonparametric
multiple functional regression model can be defined as follows:

$$y_i(t) = f(x_i(s)) + \epsilon_i(t), \quad s \in I_s,\ t \in I_t,\ i = 1,\ldots,n,$$

where $f$ is a linear operator which performs the mapping between two spaces of functions.
In this work, we consider a slightly modified model in which covariates can be a mixture
of discrete and continuous variables. More precisely, we consider the following model

$$y_i(t) = f(x_i) + \epsilon_i(t), \quad i = 1,\ldots,n, \qquad (2)$$

where $x_i \in \mathcal{X}$ is composed of two subsets $x_i^d$ and $x_i^c(s)$. Here $x_i^d \in \mathbb{R}^k$ is a $k \times 1$ vector
of discrete dependent or independent variables and $x_i^c(s)$ is a vector of $p$ continuous
functions, so each $x_i$ contains $k$ discrete values and $p$ functional variables.

Our main interest in this paper is to design an efficient estimation procedure for the
regression parameter $f$ of model (2). An estimate $f^*$ of $f \in \mathcal{F}$ can be obtained by
minimizing the following regularized empirical risk

$$f^* = \arg\min_{f \in \mathcal{F}} \sum_{i=1}^{n} \|y_i - f(x_i)\|_{\mathcal{G}_y}^2 + \lambda \|f\|_{\mathcal{F}}^2 \qquad (3)$$

Borrowing the idea from Kadri et al. (2010a), we use function-valued reproducing kernel
Hilbert spaces (RKHS) and operator-valued kernels to solve this minimization problem.
Function-valued RKHS theory is the extension of the scalar-valued case to the functional
response setting. In this context, Hilbert spaces of function-valued functions are
constructed and basic properties of real RKHS are restated. Some examples of potential
applications of these spaces can be found in Kadri et al. (2010b) and, in the area
of multi-task learning (discrete outputs), in Evgeniou et al. (2005). Function-valued
RKHS theory is based on the one-to-one correspondence between reproducing kernel
Hilbert spaces of function-valued functions and positive operator-valued kernels. We
start by recalling some basic properties of such spaces. We say that a Hilbert space $\mathcal{F}$
of functions $\mathcal{X} \to \mathcal{G}_y$ has the reproducing property if, $\forall x \in \mathcal{X}$, the evaluation functional
$f \mapsto f(x)$ is continuous. This continuity is equivalent to the continuity of the mapping
$f \mapsto \langle f(x), g \rangle_{\mathcal{G}_y}$ for any $x \in \mathcal{X}$ and $g \in \mathcal{G}_y$. By the Riesz representation theorem it
follows that for a given $x \in \mathcal{X}$ and for any choice of $g \in \mathcal{G}_y$, there exists an element
$h_x^g \in \mathcal{F}$, s.t.

$$\forall f \in \mathcal{F}, \quad \langle h_x^g, f \rangle_{\mathcal{F}} = \langle f(x), g \rangle_{\mathcal{G}_y}$$

We can therefore define the corresponding operator-valued kernel $K(\cdot,\cdot) \in \mathcal{L}(\mathcal{G}_y)$, where
$\mathcal{L}(\mathcal{G}_y)$ denotes the set of bounded linear operators from $\mathcal{G}_y$ to $\mathcal{G}_y$, such that

$$\langle K(x,z) g_1, g_2 \rangle_{\mathcal{G}_y} = \langle h_x^{g_1}, h_z^{g_2} \rangle_{\mathcal{F}}$$

It follows that $\langle h_x^{g_1}(z), g_2 \rangle_{\mathcal{G}_y} = \langle h_x^{g_1}, h_z^{g_2} \rangle_{\mathcal{F}} = \langle K(x,z) g_1, g_2 \rangle_{\mathcal{G}_y}$ and thus we obtain the
reproducing property

$$\langle K(x,\cdot) g, f \rangle_{\mathcal{F}} = \langle f(x), g \rangle_{\mathcal{G}_y}$$

It is easy to see that $K(x, z)$ is a positive kernel as defined below:
Definition: We say that $K(x,z)$, satisfying $K(x,z) = K(z,x)^*$, is a positive operator-valued
kernel if, given an arbitrary finite set of points $\{(x_i, g_i)\}_{i=1,\ldots,n} \in \mathcal{X} \times \mathcal{G}_y$, the
corresponding block matrix $\mathbf{K}$ with $\mathbf{K}_{ij} = \langle K(x_i, x_j) g_i, g_j \rangle_{\mathcal{G}_y}$ is positive semi-definite.
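
The positivity claim can be checked in one line from the construction of the kernel through the Riesz representers. The following worked step is a sketch in the real-valued setting; the coefficients $c_1, \ldots, c_n \in \mathbb{R}$ are introduced here only for the argument and do not appear in the original text.

```latex
\sum_{i,j=1}^{n} c_i c_j \,\langle K(x_i, x_j)\, g_i, g_j \rangle_{\mathcal{G}_y}
  = \sum_{i,j=1}^{n} c_i c_j \,\langle h_{x_i}^{g_i}, h_{x_j}^{g_j} \rangle_{\mathcal{F}}
  = \Big\| \sum_{i=1}^{n} c_i\, h_{x_i}^{g_i} \Big\|_{\mathcal{F}}^{2}
  \;\ge\; 0 .
```

The first equality uses the defining relation between $K$ and the elements $h_x^g$, and the second is bilinearity of the inner product in $\mathcal{F}$.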

Importantly, the converse is also true. Any positive operator-valued kernel $K(x, z)$
gives rise to an RKHS $\mathcal{F}_K$, which can be constructed by considering the space of function-valued
functions $f$ having the form $f(\cdot) = \sum_{i=1}^{n} K(x_i, \cdot) g_i$ and taking completion with
respect to the inner product given by $\langle K(x, \cdot) g_1, K(z, \cdot) g_2 \rangle_{\mathcal{F}} = \langle K(x, z) g_1, g_2 \rangle_{\mathcal{G}_y}$.

The functional version of the Representer Theorem can be used to show that the
solution of the minimization problem (3) is of the following form:

$$f^*(x) = \sum_{i=1}^{n} K(x, x_i)\, g_i \qquad (4)$$

Substituting this form in (3), we arrive at the following minimization over the scalar-valued
functions $g_i$ rather than the function-valued function $f$:

$$\min_{g_1,\ldots,g_n} \sum_{i=1}^{n} \Big\| y_i - \sum_{j=1}^{n} K(x_i, x_j)\, g_j \Big\|_{\mathcal{G}_y}^2 + \lambda \sum_{i,j}^{n} \langle K(x_i, x_j)\, g_i, g_j \rangle_{\mathcal{G}_y} \qquad (5)$$

This problem can be solved by choosing a suitable operator-valued kernel. Choosing
$K$ presents two major difficulties: the kernel must be built from an adequate
operator, and it must take as arguments variables composed of both scalars and functions.
Lian (2007) considered the identity operator, while in Kadri et al. (2010) the authors
showed that it is more useful to choose operators other than the identity that are able
to take into account functional properties of the input and output spaces. They also
introduced a functional extension of the Gaussian kernel based on the multiplication
operator. Using this operator, their approach can be seen as a nonlinear extension of
the functional linear concurrent model (Ramsay and Silverman, 2005). Motivated by
extending the functional linear regression model with functional response, we consider
in this work a kernel $K$ constructed from the integral operator and having the following
form:

$$(K(x_i, x_j)\, g)(t) = \left[ k_{x^d}(x_i^d, x_j^d) + k_{x^c}(x_i^c, x_j^c) \right] \int k_y(s,t)\, g(s)\, ds \qquad (6)$$

where $k_{x^d}$ and $k_{x^c}$ are scalar-valued kernels on $\mathbb{R}^k$ and $(\mathcal{G}_x)^p$ respectively and $k_y$ is the
reproducing kernel of the space $\mathcal{G}_y$. Choosing $k_{x^d}$ and $k_y$ is not a problem. Among
the large number of possible classical kernels $k_{x^d}$ and $k_y$, we chose the Gaussian kernel.
However, constructing $k_{x^c}$ is slightly more delicate. One can use the inner product
in $(\mathcal{G}_x)^p$ to construct a linear kernel. Also, extending real-valued functional kernels such
as those in Rossi and Villa (2006) to multiple functional inputs could be possible.
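
To make the structure of kernel (6) concrete, the following Python sketch evaluates a discretized version of it. Several choices here are assumptions rather than part of the original text: the inner product in $(\mathcal{G}_x)^p$ is taken to be a sum of $L^2$ inner products, all integrals are approximated by the trapezoidal rule, and the function names and bandwidth parameters `gamma_d` and `gamma_y` are hypothetical.

```python
import numpy as np


def trapezoid_weights(grid):
    """Trapezoidal quadrature weights for a 1-D grid."""
    grid = np.asarray(grid, dtype=float)
    w = np.zeros_like(grid)
    d = np.diff(grid)
    w[:-1] += d / 2.0
    w[1:] += d / 2.0
    return w


def gaussian(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2) on finite-dimensional vectors."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.sum(diff ** 2)))


def scalar_weight(xd_i, xc_i, xd_j, xc_j, s_w, gamma_d=1.0):
    """Bracketed scalar factor of (6): k_{x^d} + k_{x^c}.

    xd_*: (k,) discrete parts; xc_*: (p, q) functional parts sampled on the
    s-grid; s_w: (q,) quadrature weights. k_{x^c} is the linear kernel, here
    assumed to be sum_k int x_ik(s) x_jk(s) ds.
    """
    k_d = gaussian(xd_i, xd_j, gamma_d)
    k_c = float(np.einsum("kj,j,kj->", xc_i, s_w, xc_j))
    return k_d + k_c


def apply_kernel(weight, g, t_grid, gamma_y=10.0):
    """Approximate (K(x_i, x_j) g)(t_l) = weight * int k_y(s, t_l) g(s) ds."""
    t_grid = np.asarray(t_grid, dtype=float)
    t_w = trapezoid_weights(t_grid)
    # Ky[a, b] = k_y(t_a, t_b), a Gaussian kernel on the output domain
    Ky = np.exp(-gamma_y * (t_grid[:, None] - t_grid[None, :]) ** 2)
    return weight * (Ky.T @ (t_w * np.asarray(g, dtype=float)))
```

Because the scalar factor does not depend on $t$, the kernel separates into a scalar Gram entry times a fixed integral operator, which is what makes the block structure of the resulting linear system easy to assemble.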

To solve problem (5), we assume that $\mathcal{G}_y$ is a real-valued RKHS with reproducing
kernel $k_y$, so that each function in this space can be approximated by a finite linear
combination of kernels. The functions $g_i(\cdot)$ can thus be approximated by $\sum_{l=1}^{m} \alpha_{il}\, k_y(t_l, \cdot)$,
and solving (5) reduces to finding the corresponding real variables $\alpha_{il}$. Under this framework
and using matrix formulation, we find that the $nm \times 1$ vector $\alpha$ satisfies the system
of linear equations

$$(\mathbf{K} + \lambda I)\,\alpha = Y \qquad (7)$$

where the $nm \times 1$ vector $Y$ is obtained by concatenating the columns of the matrix
$(Y_{il})_{i \le n,\, l \le m}$ and $\mathbf{K}$ is the block operator kernel matrix $(\mathbf{K}_{ij})_{1 \le i,j \le n}$, where each $\mathbf{K}_{ij}$ is
an $m \times m$ matrix.
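
A linear system of the form (7) can be assembled and solved numerically as in the Python sketch below. This is a simplified illustration under stated assumptions, not the authors' implementation: the scalar Gram matrix `kappa` (the bracketed factor of the kernel for every pair of inputs) is assumed to be precomputed, $k_y$ is taken to be Gaussian, the integral operator is discretized with quadrature weights, and the unknowns are the values of the $g_i$ on the output grid rather than expansion coefficients. The responses are stacked observation by observation so that they match the block layout of the matrix. All function and parameter names are hypothetical.

```python
import numpy as np


def fit_operator_kernel_ridge(kappa, Y, t_grid, t_weights, lam=1e-2, gamma_y=10.0):
    """Solve a discretized system (K + lam * I) a = Y.

    kappa     : (n, n) scalar Gram matrix between training inputs (assumed PSD)
    Y         : (n, m) responses y_i(t_l) sampled on the output grid
    t_grid    : (m,)   output grid
    t_weights : (m,)   quadrature weights on the output grid
    returns   : G (n, m) sampled functions g_i, and T (m, m) discretized operator
    """
    n, m = Y.shape
    t_grid = np.asarray(t_grid, dtype=float)
    Ky = np.exp(-gamma_y * (t_grid[:, None] - t_grid[None, :]) ** 2)
    # (T g)(t_l) = sum_l' k_y(t_l', t_l) * w_l' * g(t_l')
    T = Ky * np.asarray(t_weights, dtype=float)[None, :]
    # Block (i, j) of the nm x nm matrix is kappa[i, j] * T
    K = np.kron(kappa, T)
    a = np.linalg.solve(K + lam * np.eye(n * m), Y.reshape(n * m))
    return a.reshape(n, m), T


def predict(kappa_new, G, T):
    """Evaluate f*(x) = sum_i kappa(x, x_i) * (T g_i) on the output grid.

    kappa_new : (n_new, n) scalar kernel values between new and training inputs
    returns   : (n_new, m)
    """
    return kappa_new @ (G @ T.T)
```

The Kronecker product is used here only for readability; for larger $n$ and $m$ the same system could presumably be solved more cheaply by exploiting the separable structure, which is outside the scope of this sketch.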

3. Conclusion 

We study the problem of multiple functional regression where several functional explanatory
variables are used to predict a functional response. Using function-valued
RKHS theory, we have proposed a nonparametric estimation procedure which supports
mixed discrete and continuous covariates. In future work, we will illustrate our approach and
evaluate its performance by experiments on simulated and real data.